Current analyses of the Lyman-alpha forest assume that the primordial power spectrum of
density perturbations obeys a simple power law, a strong theoretical assumption which should
be tested. Employing a large suite of numerical simulations which drop this assumption, we
reconstruct the shape of the primordial power spectrum using Lyman-alpha data from the
Sloan Digital Sky Survey (SDSS). Our method combines a minimally parametric framework
with cross-validation, a technique used to avoid over-fitting the data. Future work will involve
predictions for the upcoming Baryon Oscillation Spectroscopic Survey (BOSS), which will
provide new Lyman-alpha data with vastly decreased statistical errors.



1 Introduction

The Lyman-α forest is the name given to a series of absorption lines in quasar spectra, caused
by the scattering of photons via interaction with neutral hydrogen at redshifts 2-4. At these
redshifts, a large proportion of the baryon density of the universe is contained within hydrogen
clouds. Most of the hydrogen is ionized, but a small fraction remains neutral, and absorbs
photons via the Lyman-α transition. Hence, the Lyman-α forest is sensitive to the matter
power spectrum on scales from a few up to tens of Mpc, making it the only currently available
probe of fluctuations at these weakly non-linear scales. A number of authors have examined the
constraints obtainable from the Lyman-α forest in the past, including Croft et al., Gnedin &
Hamilton, and Viel, Haehnelt & Springel.

Previous analyses of constraints from the Lyman-α forest have assumed that the primordial
power spectrum is described by a nearly scale-invariant power law. This deserves further
attention for a number of reasons. First, it is a strong assumption; if the data are inconsistent
with it, the derived constraints could be biased. Second, it is a generic prediction of
inflationary models; hence, testing for a power-law primordial power spectrum is a test of
inflation, provided any departure found cannot be attributed to data systematics. Third, of all
current datasets, the Lyman-α forest constrains the smallest cosmological scales; thus, it
provides the best opportunity presently available to understand the overall shape of the power
spectrum. To do this, we shall reconstruct the primordial power spectrum in a minimally
parametric way, using a technique called cross-validation to robustly recover the signal. If the
data are in agreement with theoretical expectations, the recovered power spectrum will be
nearly scale-invariant. In these Proceedings, we discuss a minimally parametric framework for
constraining the primordial matter power spectrum, the cross-validation technique, and the
methodology for obtaining constraints from observations. Finally, some preliminary results are
presented.



2 Flux Power Spectrum 

In the case of the Lyman-α forest, the observable is not a direct measurement of the clustering
properties of tracer objects, as in galaxy clustering, but the statistics of absorption along a
number of quasar sightlines. It is easiest to work with the statistics of the flux,
$\mathcal{F}$, defined as

$$\mathcal{F} = \exp(-\tau), \tag{1}$$

where $\tau$ is the optical depth. The primary observable here is the one-dimensional flux power
spectrum, $P_F$,

$$P_F(k) = |\tilde{\mathcal{F}}(k)|^2, \tag{2}$$

where $\tilde{\mathcal{F}}$ is the Fourier transform of the flux, evaluated as a function of
distance along the line of sight,

$$\tilde{\mathcal{F}}(k) = \int \mathcal{F}(x) \, e^{-ikx} \, dx. \tag{3}$$

Current constraints on $P_F$ are given by McDonald et al., determined from ~3000 SDSS quasar
spectra.
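
As an illustration only, Eqs. (1)-(3) can be evaluated numerically for a single sightline. The
sketch below assumes a uniformly sampled optical depth and replaces the integral in Eq. (3) by a
discrete Fourier transform; the function name `flux_power_1d`, the division by the sightline
length and the toy optical depth are assumptions made for this example, not part of the analysis
pipeline described here.

```python
import numpy as np


def flux_power_1d(tau, dx):
    """Return (k, P_F) for one sightline on a uniform grid.

    tau : array of optical depths, one per pixel
    dx  : pixel spacing along the line of sight
    """
    flux = np.exp(-np.asarray(tau, dtype=float))  # Eq. (1)
    n = flux.size
    # Discrete stand-in for Eq. (3): the integral becomes a sum times dx.
    flux_k = np.fft.rfft(flux) * dx
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=dx)
    # Eq. (2); dividing by the sightline length is an assumed normalisation.
    power = np.abs(flux_k) ** 2 / (n * dx)
    return k, power


# Toy sightline (made-up numbers): one sinusoidal ripple in the optical depth.
x = np.arange(256) * 0.5
tau_demo = 0.3 + 0.05 * np.sin(2.0 * np.pi * 8 * x / (256 * 0.5))
k_demo, pf_demo = flux_power_1d(tau_demo, dx=0.5)
```

In practice the power spectrum would be averaged over many sightlines; a single sightline is used
here only to keep the example short.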

In order to simulate the observable flux power spectrum from a given set of primordial
fluctuations, a large N-body simulation is required. This makes it impractical to directly
calculate $P_F$ for every possible set of input parameters; instead, simulations are run for a
representative sample. Other results are obtained via interpolation, using the following scheme
of Viel & Haehnelt. The flux power spectrum is assumed to be given by a Taylor expansion around
some best-fit model. For a vector of parameters $p_i$, with best-fit model parameters $p_i^0$,
the flux power spectrum $P_F$ is given by



$$P_F(p_i) = P_F(p_i^0) + \sum_i \frac{\partial P_F}{\partial p_i} (p_i - p_i^0) + \sum_i \frac{\partial^2 P_F}{\partial p_i^2} (p_i - p_i^0)^2 . \tag{4}$$

Numerical simulations are used to calculate the derivatives of the flux power spectrum. Each 
parameter is varied independently, and the total change in the flux power spectrum is assumed 
to be a linear combination of the change due to each parameter, i.e., 

$$\delta P_F = \frac{\partial P_F}{\partial p_1} \delta p_1 + \frac{\partial P_F}{\partial p_2} \delta p_2 + \ldots \tag{5}$$

Figure 1 shows the error due to this approximation for a sample input primordial power
spectrum. The error is around 1% on scales probed by current Lyman-α data
($k = 0.4$ to $2\,h\,\mathrm{Mpc}^{-1}$), which is a small contribution to the total error,
allowing us to proceed with confidence. Further checks on interpolation errors are in progress,
and are expected to give similar results.
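
The interpolation scheme above can be sketched in a few lines for a single $(k, z)$ bin. This is
a minimal illustration, assuming three simulations per parameter (at the best-fit value and one
step either side) and central finite differences; the function names and the step size `h` are
hypothetical. The quadratic coefficient is defined here as whatever multiplies
$(p_i - p_i^0)^2$, so any numerical factor in front of the second derivative is absorbed into it.

```python
def finite_difference_coeffs(pf_minus, pf0, pf_plus, h):
    """Linear and quadratic coefficients for one parameter.

    pf_minus, pf0, pf_plus : simulated flux power at p0 - h, p0, p0 + h
    h                      : parameter step between simulations (assumed symmetric)
    """
    d1 = (pf_plus - pf_minus) / (2.0 * h)
    # Coefficient of (p - p0)^2 for the parabola through the three points.
    d2 = (pf_plus - 2.0 * pf0 + pf_minus) / (2.0 * h ** 2)
    return d1, d2


def interpolate_flux_power(p, p0, pf0, d1, d2):
    """Evaluate the expansion around the best-fit model in one (k, z) bin.

    p, p0  : sequences of parameter values and best-fit values
    pf0    : flux power at the best-fit model
    d1, d2 : per-parameter linear and quadratic coefficients
    """
    total = pf0
    # Each parameter contributes independently, as assumed in the text.
    for pi, p0i, a, b in zip(p, p0, d1, d2):
        step = pi - p0i
        total += a * step + b * step ** 2
    return total


# Toy numbers (not from any simulation): one parameter, best fit at p0 = 1.
a, b = finite_difference_coeffs(pf_minus=1.75, pf0=2.0, pf_plus=4.75, h=0.5)
pf_new = interpolate_flux_power([1.2], [1.0], 2.0, [a], [b])
```

Because each parameter enters through its own pair of coefficients, the number of simulations
needed grows linearly with the number of parameters rather than exponentially.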

3 Power Spectrum Reconstruction 

Previous analyses of the Lyman-α forest have assumed the primordial power spectrum is a
nearly scale-invariant power law, of the form

$$P(k) = A_s \left( \frac{k}{k_0} \right)^{n_s - 1} . \tag{6}$$

As discussed above, we seek to test whether the data support this assumption by reconstructing
the power spectrum with smoothing splines^a, as proposed in Sealfon et al. Smoothing splines
are used because they have good continuity properties and are particularly suited to formulation
of a cross-validation penalty.

^a Splines are piecewise cubic polynomials with globally continuous first and second derivatives,
completely specified by their values at a series of knots, where the polynomials meet.

Figure 1: The difference between the flux power spectrum as obtained from interpolation, using
Eq. (4), and directly by simulation. Each line represents simulation output at a different
redshift bin, between z = 2.0 and z = 4.2. Red dots show the positions of spline knots. The grey
band shows 1% error bars.
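
To make the spline parametrisation concrete, the sketch below builds a cubic spline that is fully
determined by its values at a set of knots. It is an illustration under stated assumptions: the
natural boundary condition (zero second derivative at the end knots), the knot positions and the
power-law test values are all choices made for this example, and `natural_cubic_spline` is a
hypothetical helper rather than the code used in this work.

```python
import numpy as np


def natural_cubic_spline(knots, values):
    """Return a function evaluating the cubic spline through (knots, values)."""
    x = np.asarray(knots, dtype=float)
    y = np.asarray(values, dtype=float)
    n = x.size
    h = np.diff(x)
    # Solve for the second derivatives m at the knots.
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0  # natural boundary condition (an assumption)
    for i in range(1, n - 1):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        rhs[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, rhs)

    def evaluate(xq):
        xq = np.asarray(xq, dtype=float)
        # Index of the interval [x[i], x[i+1]] containing each query point.
        i = np.clip(np.searchsorted(x, xq) - 1, 0, n - 2)
        left = x[i + 1] - xq
        right = xq - x[i]
        return ((m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i])
                + (y[i] / h[i] - m[i] * h[i] / 6.0) * left
                + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right)

    return evaluate


# Toy knots and a power-law test spectrum (all numbers are made up).
knot_k = np.array([0.4, 0.8, 1.2, 1.6, 2.0])
knot_p = 1.0 * (knot_k / 1.0) ** (0.96 - 1.0)
spline = natural_cubic_spline(knot_k, knot_p)
p_between = spline(np.linspace(0.4, 2.0, 50))
```

The free parameters of such a reconstruction are the knot values alone; the polynomial
coefficients between knots follow from the continuity conditions.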



4 Cross-Validation

Any minimally parametric formalism, when applied to noisy data, runs the risk of over-fitting
the data. One way to avoid this problem is a technique called cross-validation, described in
Peiris & Verde. This technique assumes that noise in the data takes the form of additional
small-scale structure, and thus power spectra with superfluous fluctuations should be penalised.
This penalty is implemented by adding an extra term to the likelihood function, $\mathcal{L}$:



$$\log \mathcal{L} = \log \mathcal{L}(\mathrm{Data} \,|\, P(k)) + \lambda \int_k dk \, \left( P''(k) \right)^2 . \tag{7}$$



Here $\lambda$, the penalty weight, is a free parameter. In the limit $\lambda \to \infty$ this
likelihood becomes functionally identical to linear regression, while $\lambda \to 0$ is
appropriate in the case of noiseless data. In order to determine the optimal value for
$\lambda$, the data points are first divided into two sets, the training set, or CV1, and the
validation set, or CV2. CV1 and CV2 are composed of alternating data bins. Next, to calculate
the CV score, a value is chosen for $\lambda$, and the best-fit power spectrum based on the CV1
dataset is found. The $\chi^2$ is then calculated for this power spectrum with the CV2 dataset.
This is repeated, replacing CV1 with CV2 and vice versa, and the CV score is the sum of both
$\chi^2$ values.

The key to cross-validation is that signal in the CV1 dataset will predict signal in the CV2
dataset well, while noise in CV1 will predict noise in CV2 poorly. The optimal choice of
$\lambda$ is therefore the one which allows maximal predictivity between CV1 and CV2; in other
words, the one which minimizes the CV score.
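
The procedure above can be sketched on a toy problem. The example below is a simplified stand-in,
not the analysis itself: it assumes the data are direct Gaussian measurements of $P$ in each bin
(rather than flux power spectra), samples $P$ on the data bins, replaces the integral of
$(P'')^2$ by a sum of squared second differences, and minimizes $\chi^2$ plus the penalty so that
roughness is disfavoured. All function names and numbers are hypothetical.

```python
import numpy as np


def second_difference_matrix(m):
    """Matrix D such that (D @ p)[i] = p[i] - 2 p[i+1] + p[i+2]."""
    d = np.zeros((m - 2, m))
    for i in range(m - 2):
        d[i, i:i + 3] = (1.0, -2.0, 1.0)
    return d


def penalised_fit(y, sigma, train, lam):
    """Minimise chi^2 on the training bins plus lam * roughness (lam > 0)."""
    w = np.where(train, 1.0 / sigma ** 2, 0.0)  # validation bins get zero weight
    d = second_difference_matrix(y.size)
    lhs = np.diag(w) + lam * d.T @ d
    return np.linalg.solve(lhs, w * y)


def chi2(model, y, sigma, mask):
    return float(np.sum(((model[mask] - y[mask]) / sigma[mask]) ** 2))


def cv_score(y, sigma, lam):
    """Sum of validation chi^2 with CV1 and CV2 swapped."""
    cv1 = np.arange(y.size) % 2 == 0  # alternating data bins
    cv2 = ~cv1
    score = 0.0
    for train, valid in ((cv1, cv2), (cv2, cv1)):
        fit = penalised_fit(y, sigma, train, lam)
        score += chi2(fit, y, sigma, valid)
    return score


def best_penalty(y, sigma, lams):
    """Return the penalty weight with the lowest CV score, and all scores."""
    scores = [cv_score(y, sigma, lam) for lam in lams]
    return lams[int(np.argmin(scores))], scores


# Toy data (made up): a smooth trend plus Gaussian noise.
rng = np.random.default_rng(0)
grid = np.arange(20.0)
sigma_demo = np.full(grid.size, 0.1)
y_demo = 1.0 + 0.05 * grid + rng.normal(0.0, 0.1, grid.size)
lams_demo = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
lam_best, scores_demo = best_penalty(y_demo, sigma_demo, lams_demo)
```

Setting the validation weights to zero means the validation bins are constrained only by the
smoothness penalty, which is why the penalty weight must be strictly positive in this sketch.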



5 Results 

We performed a large grid of N-body simulations using Gadget-II. Convergence checks were
carried out to ensure $P_F$ was not significantly affected by simulation settings, such as
particle resolution or box size. Initial conditions included a variety of input power spectra,
on scales ranging from $k = 0.45$ to $2\,h\,\mathrm{Mpc}^{-1}$.

A significant departure from a power law primordial power spectrum translates to a detectable
feature in the flux power spectrum, which is more noticeable at higher redshifts. This is due to
the way in which the matter power spectrum evolves: as structure grows, a feature in the matter
power spectrum creates extra non-linear growth on smaller scales, making the feature in $P_F$
stand out less at lower redshifts. The results of the simulations provide a mapping between
primordial and flux power spectra, which in turn provides a likelihood function for any given
primordial power spectrum from SDSS data. The full data analysis, including cross-validation, is
currently being carried out.

6 Future Prospects 

The best constraints on the flux power spectrum currently come from the Sloan Digital Sky
Survey (SDSS), which contains ~3000 quasar sightlines. In the near future, better constraints
will be available from the Baryon Oscillation Spectroscopic Survey (BOSS), part of SDSS-III.
This will contain 160,000 quasar spectra between redshifts of 2.2 and 3, and should further
increase the statistical power of the Lyman-α forest. We plan to make forecasts for BOSS in
forthcoming work.